ABSTRACT *

This report presents a computer model for the representation of
human  faces.  This  three-dimensional,  parametric  model  produces 
shaded  facial  images.  The  face,  constructed  of  polygonal  surfaces, 
is  manipulated  through  the  use  of  parameters  which  control 
interpolation,  translation,  rotation  and  scaling  of  the  various 
facial  features. 


With this model, very little input information is needed to
specify and generate a specific face with a specific expression.
The  model  has  been  successfully  used  to  produce  a large  variety  of 
facial  images  and  several  animated  sequences.  The  animated 
sequences  illustrate  the  power  of  the  model  to  change  facial 
expression  and  conformation. 


Experience  with  the  model  indicates  that  fewer  than  10 
parameters  must  be  manipulated  to  produce  reasonable  speech 
synchronized  facial  animation. 



* This  report  reproduces  a dissertation  of  the  same  title  submitted 
to  the  Computer  Science  Department,  University  of  Utah,  in  partial 
fulfillment of the requirements for the degree of Doctor of Philosophy.

CHAPTER 1


INTRODUCTION 


The  work  presented  in  this  report  is  the  continuation  and 
extension of my previous research [1,2]. In general, it is
concerned  with  developing  ways  to  represent,  using  polygons,  objects 
having  flexible  surfaces.  Specifically,  this  research  has  focused 
on  the  representation  of  human  faces. 

Why  deal  with  human  faces?  There  are  several  reasons.  The  face 
presents  a challenge.  It  is  flexible  and  varies  from  person  to 
person. We have a well-developed facility for interpreting facial
expression.  It  is  a difficult  task  to  produce  convincing 
computer-generated  facial  images. 

Faces  have  always  been  a favorite  subject  for  artists  and 
photographers.  They  are  skilled  in  using  facial  images  to 
communicate ideas, emotions and moods. We can view computer
graphics  as  an  additional  tool  or  medium  for  expression.  As  this 
medium  matures  we  would  certainly  hope  that  techniques  will  be 
developed  to  handle  objects  with  the  complexity  and  subtlety  of  the 
human  face.  Hopefully  this  work  is  a step  in  that  direction. 

This research relies heavily on the work of Watkins [3] and
Gouraud [4]. Watkins developed an efficient algorithm for solving
the visible surface problem for polygonal objects and Gouraud
developed a "smooth" shading technique for these objects. Together
these  developments  make  it  possible  to  generate  "realistic"  images 
of  objects  modeled  with  polygonal  surfaces. 

Some work has already been done with the two-dimensional
representation of faces. Gillenson [5] has developed an interactive
system for assisting "non-talented" users to assemble and modify facial
images. Chernoff [6] has demonstrated the use of computer-generated
faces as a means of communicating "multi-dimensional" information.

My previous work showed that faces could be successfully
represented with polygonal surfaces and that, at least to a limited
extent, such faces could be animated. But the process was
difficult, involving the collection of three-dimensional data for
each face and each expression. We would like a model for faces
which  allows  a class  or  universe  of  faces  to  be  generated.  A 
particular  face  from  this  universe  would  be  specified  by  setting  a 
number  of  parameter  values.  The  remaining  chapters  discuss  the 
development  and  application  of  such  a model. 


CHAPTER  2 


INTERPOLATION  AS  A MEANS  OF  SPECIFYING  POLYGONAL  SURFACES 

Interpolation  of  polygon  surfaces  was  the  basis  of  my  earlier 
work with the representation and animation of faces [1,2]. The
recognition  that  interpolation  is  a good  way  to  specify  flexible 
polygonal  surfaces  was  a key  factor  in  the  research  leading  to  a 
parametric  model  for  faces.  Interpolation  has  also  been  used  to 
specify  flexible  surfaces  other  than  faces. 

2.1  The  Interpolation  Concept 


The notion of interpolation is quite simple. In the
one-dimensional case, we are given two values and asked to
determine an intermediate value. The desired intermediate value is
specified by a fractional coefficient a.

    value = a(value 1) + (1-a)(value 2)        0 ≤ a ≤ 1

This concept is easily expanded into more than one dimension by
simply applying this procedure in each dimension. The idea
generalizes to polygonal surfaces by applying the scheme to each
vertex defining the surface. Each vertex will have two
three-dimensional positions associated with it. Intermediate forms
of the surface are achieved by interpolating each vertex between its
extreme positions.
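
The vertex-by-vertex scheme can be sketched in a few lines of Python. This is only an illustration, not the program used for this report; the function name and the tuple representation of a vertex are assumptions made for the example.

```python
from typing import List, Tuple

Vertex = Tuple[float, float, float]


def interpolate_surface(first: List[Vertex], second: List[Vertex],
                        a: float) -> List[Vertex]:
    """Blend two forms of a surface: a = 1 gives `first`, a = 0 gives `second`."""
    if not 0.0 <= a <= 1.0:
        raise ValueError("a must lie between 0 and 1")
    if len(first) != len(second):
        raise ValueError("both forms need the same number of vertices")
    # value = a(value 1) + (1-a)(value 2), applied to every coordinate
    return [
        tuple(a * p + (1.0 - a) * q for p, q in zip(v1, v2))
        for v1, v2 in zip(first, second)
    ]
```

Calling `interpolate_surface(form_1, form_2, 0.5)` would give the surface halfway between the two extreme forms.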


Figure 2.1 - The transition from a block
letter "H" to an SR71 aircraft using interpolation.

2.2 Topology Considerations

For  this  technique  to  work,  the  topology  must  be  the  same  for 
both  extremes  of  the  surface.  In  other  words,  the  number  of 
vertices  defining  the  surface  and  their  interconnection  must  be 
identical  for  both  forms  of  the  surface. 

Figure  2.1  illustrates  the  interpolation  of  a polygonal 
surface. This figure shows the transition of a surface from the
shape of a block letter "H" to that of an SR71 aircraft. A topology
was developed that allowed representation of the airplane. This
topology  was  then  mapped  onto  the  block  letter  "H".  Since  both 
topologies  are  the  same,  interpolation  can  be  used  to  transform  one 
shape  into  the  other. 
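
As a rough illustration of the topology requirement, the check below compares two surface descriptions before they are interpolated. The representation (a vertex count plus polygons given as tuples of vertex indices) is an assumption made for this sketch, not the data structure of the original program.

```python
from typing import Sequence, Tuple

Polygon = Tuple[int, ...]  # vertex indices, in order around the polygon


def same_topology(n_vertices_a: int, polygons_a: Sequence[Polygon],
                  n_vertices_b: int, polygons_b: Sequence[Polygon]) -> bool:
    """True when both forms have the same vertex count and interconnection."""
    if n_vertices_a != n_vertices_b:
        return False
    # The polygons must list the same vertex indices in the same order.
    return [tuple(p) for p in polygons_a] == [tuple(p) for p in polygons_b]
```

Only the vertex positions may differ between two forms that pass this check.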

2.3  Interpolation  From  Face  to  Face 

A basic  assumption  underlying  the  development  of  a parametric 
facial  model  is  that  a single  facial  topology  can  be  used.  If  the 
facial  topology  is  fixed,  manipulating  the  face  involves 
manipulating  only  the  vertex  positions. 

From  previous  work  it  was  known  that  a fixed  topology  would 
allow a specific face to change expression. Would a single topology
allow the representation of a wide range of faces? Could the
topology be mapped onto different faces? Would the transition
between faces be reasonable? To answer these questions we collected
data from a number of faces.

Figure 2.2 - The facial topology used to collect
data  from  a number  of  real  faces.  The  small  circles 
indicate  vertices  that  lie  along  creases  in  the  face. 




Figure 2.3 - The images (b), (c) and (d) were
generated using data obtained from the plastic model
shown in (a). The plastic model was used as a storage
device for the facial topology.

The first step in this process was to determine a sufficiently
flexible topology. The adopted topology is shown in Figure 2.2.

This  topology  was  first  applied  to  a plastic  model  of  the  human 
head.  This  model,  shown  in  Figure  2.3(a),  served  as  a storage 
device  for  the  topology.  The  model  was  used  as  a guide  each  time 
the  topology  was  applied  to  a real  face.  This  assured  that  the 
topology  would  be  identical  from  face  to  face. 

The  problem  of  collecting  data  from  real  faces  is  discussed  in 
Appendix C. A more general discussion of this three-dimensional
data collection process is given in [7]. A different but related
approach to this problem is discussed in [8].

We collected data from one side of 10 different faces including
the  plastic  model.  Figures  2.3(b),  (c)  and  (d)  show  images 
generated  using  data  from  the  plastic  model.  Figure  2.3(b)  shows 
half  the  face  with  polygonal  shading.  Figure  2.3(c)  shows  a 
complete  faceted  face  generated  by  reflecting  the  data  about  the 
center  of  the  face.  Figure  2.3(d)  shows  the  same  symmetric  face 
shaded using Gouraud’s smooth shading technique [4].

Using data from the various faces, I made a computer animated
film showing transitions from face to face. This film demonstrated
that, at least for the faces used, a single topology would allow
representation of many faces and the reasonable transition between
faces. Figure 2.4 illustrates the transition between two of the
faces.


Figure  2.4  - The  transition  from  one 
face  to  another  face  using  interpolation. 


CHAPTER  3 


DEVELOPMENT  OF  THE  PARAMETRIC  MODEL 

The  parametric  model  is  based  on  data  obtained  from  the  plastic 
model  described  in  the  previous  chapter.  Manipulation  capabilities 
were developed to transform this static data structure into a
dynamic, parametrically controlled model. These manipulation
capabilities are implemented by means of parameters which control
the interpolation, translation, rotation or scaling of the various
facial features. The model is symmetric. Except for the eyes, one
side  of  the  face  is  a mirror  image  of  the  other  side.  Figure  3.1 
gives  an  overall  view  of  the  model  structure.  For  this  model, 
interpolation  is  applied  independently  to  local  regions  of  the  face 
rather  than  globally  to  the  whole  face. 

The  parameters  controlling  the  face  are  divided  into  two  main 
categories,  those  controlling  expression  manipulation  and  those 
controlling  facial  conformation.  Only  the  eye  and  mouth  regions  are 
involved  in  expression.  The  remainder  of  this  chapter  details  the 
implementation  of  the  parameters. 

3.1 The Eyes

The first step in developing the parametric model was the
development of realistic eyes. This was done in two phases: first
the eyeball and then the eyelid mechanism.

3.1.1 The Eyeball Model

The  eyeball  was  developed  as  a procedure  to  be  called  for  each 
instance  rather  than  as  a data  structure  to  be  scanned  for  each 
instance.  The  eyeball  was  first  thought  of  as  a polygonal 
hemisphere  as  shown  in  Figure  3.2.  This  hemisphere  consists  of  a 
12-sided  pupil  polygon  surrounded  by  three  rings  of  12 
quadrilaterals each. The first ring is the iris and the other two
form  the  white  of  the  eye.  The  decision  to  use  12  polygons  per  ring 
was  a compromise  between  eyeball  complexity  and  a desire  for  the 
pupil  and  iris  to  appear  nearly  circular. 

To  achieve  more  realistic  eyes,  I decided  to  include  a 
reflection of the light source on each eye. It was pointed out [9]
that  eye  reflections  are  almost  always  visible  over  some  portion  of 
the  iris  or  pupil.  This  is  due  to  the  fact  that  the  eyeball  is  not 
really  spherical  but  has  a smaller  partial  sphere  (the  lens) 
superimposed  on  it  as  shown  in  Figure  3.3.  Therefore,  the 
reflection  spot  is  modeled  as  a 6-sided  polygon  tangent  to  the 
surface  of  the  lens  and  free  to  move  over  the  surface  of  the  lens. 
The  exact  position  of  the  reflection  spot  depends  on  the  position 
and  orientation  of  the  eyeball  within  the  head,  and  the  positions  of 
the  light  source  and  viewer  in  relation  to  the  head. 

Real  eyes  have  the  ability  to  look  at  or  track  objects  in  their 
environment.  This  ability  is  included  in  the  model.  The 
orientation  angles  of  each  eyeball  depend  on  the  position  of  the 
eyeball within the head and the position of the point at which they
are looking. The eyes behave like normal eyes with both eyes
looking at or tracking the same point. The orientation of each
eyeball  is  computed  independently.  A detailed  explanation  of  the 
eye  orientation  and  reflection  spot  algorithms  is  given  in 
Appendix  A. 
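
The idea of computing each eyeball's orientation independently can be illustrated as follows. This sketch is not the Appendix A algorithm. It assumes a coordinate system with X pointing forward out of the face, Y across the face and Z up, and it describes an orientation by two hypothetical angles, a horizontal `yaw` and a vertical `pitch`.

```python
import math
from typing import Tuple

Point = Tuple[float, float, float]


def eye_angles(eye_center: Point, target: Point) -> Tuple[float, float]:
    """Return (yaw, pitch) in radians that aim one eyeball at `target`.

    Assumed axes: X forward, Y across the face, Z up.
    """
    dx, dy, dz = (t - e for t, e in zip(target, eye_center))
    yaw = math.atan2(dy, dx)                    # rotation about the Z axis
    pitch = math.atan2(dz, math.hypot(dx, dy))  # elevation above the XY plane
    return yaw, pitch


# Each eye is handled separately, so a nearby target makes the eyes converge.
left = eye_angles((0.0, -3.0, 0.0), (100.0, 0.0, 0.0))
right = eye_angles((0.0, 3.0, 0.0), (100.0, 0.0, 0.0))
```

Here the two yaw angles have opposite signs, which is the slight inward turn expected when both eyes look at the same point.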

A further  enhancement  of  the  eye  realism  is  obtained  by  adding 
an  additional  ring  of  polygons  to  the  eyeball.  This  ring  forms  a 
fringe  around  the  iris  as  shown  in  Figure  3.4.  Figure  3.5  shows 
halftone  renderings  of  the  eyes  produced  by  the  eyeball  procedure. 
Figure 3.5(a) shows the eyeballs with the iris fringe while Figure
3.5(b)  shows  the  eyeballs  without  the  fringe.  In  Figure  3.5(a)  the 
relative  sizes  of  the  iris  and  pupil  are  nearly  normal. 

3.1.2 The Eyelids

The next problem is to "install" the eyeballs in the face. The
first  attempt  was  to  fit  the  eyeballs  into  an  existing  static  face. 
This  was  done  by  estimating  the  size  and  position  of  each  eyeball 
and  then  generating  eyeballs  based  on  these  estimates. 

The  major  difficulty  encountered  is  fitting  the  eyelids  to  the 
eyeballs.  The  eyelids  are  fitted  by  computing  the  polar  coordinates 
(origin  at  the  center  of  the  eyeball)  of  each  vertex  of  the  eyelid 
polygons.  The  eyelid  vertices  are  then  mapped  onto  a sphere 
slightly  larger  than  the  eyeball  and  centered  on  it  by  setting  the 
radius  coordinate  of  each  vertex  to  1.1  times  the  eyeball  radius  and 
converting back to Cartesian coordinates. Figure 3.6 shows halftone
renderings obtained using this procedure.

Figure 3.5 - Halftone renderings of the
eyeballs with (a) and without (b) the iris fringe.

Figure 3.6 - The eyeballs "fitted" into the static facial model.
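
The eyelid fitting step (polar coordinates about the eyeball center, radius forced to 1.1 times the eyeball radius, then back to Cartesian coordinates) might look like the sketch below. The function name and the particular choice of polar angles are assumptions for illustration only.

```python
import math
from typing import Tuple

Point = Tuple[float, float, float]


def fit_to_eyeball(vertex: Point, center: Point, eyeball_radius: float,
                   factor: float = 1.1) -> Point:
    """Move an eyelid vertex onto a sphere slightly larger than the eyeball."""
    dx, dy, dz = (v - c for v, c in zip(vertex, center))
    r = math.sqrt(dx * dx + dy * dy + dz * dz)
    if r == 0.0:
        raise ValueError("vertex coincides with the eyeball center")
    # Polar coordinates with the origin at the center of the eyeball.
    theta = math.acos(max(-1.0, min(1.0, dz / r)))
    phi = math.atan2(dy, dx)
    # Keep the two angles, replace the radius, convert back.
    new_r = factor * eyeball_radius
    return (
        center[0] + new_r * math.sin(theta) * math.cos(phi),
        center[1] + new_r * math.sin(theta) * math.sin(phi),
        center[2] + new_r * math.cos(theta),
    )
```

Every eyelid vertex keeps its direction from the eyeball center; only its distance changes.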

There are several deficiencies with this technique. The most
obvious is that the eyelids are still static. We want the
eyelids to open and close under parametric control. The second
deficiency is that the existing eyelid topology was not as detailed
as we would like. We want some eyelashes, a few more polygons in
the upper lid to allow it to follow the curvature of the eyeball
better, especially when closed, and polygons to model the corner of
the eye and the lip of the eyelids. Figure 3.7 shows the improved
eyelid topology adopted.


Figure 3.7 - The improved eyelid topology. Vertices within
the dashed line are involved in opening and closing the eyelid.

With the improved topology we are now ready to face the problem
of  getting  the  eyelids  to  open  and  close.  This  problem  was  solved 
by  a combination  of  linear  interpolation  and  a variation  of  the 
spherical  mapping  idea  described  above.  In  Figure  3.7  a dashed  line 
is shown enclosing a set of vertices. These vertices are the ones
involved  in  opening  and  closing  the  eyelid.  These  vertices  are 
defined  in  only  two  dimensions,  height  and  width.  The  third 
dimension  is  obtained  by  projecting  the  vertices  back  onto  a sphere 
slightly  larger  than  and  centered  on  the  eyeball. 

Using  two  sets  of  two-dimensional  data  values  for  these  points, 
one  set  for  open  and  one  set  for  closed,  we  can  interpolate  for  data 
values  between  the  two  sets.  Projecting  these  interpolated  values 
back  onto  the  sphere  produces  the  desired  eyelid.  This  procedure, 
illustrated  in  Figure  3.8,  gives  the  dynamic,  parametrically 
controlled  eyelid  we  need.  The  parameter  controlling  the  eyelid  is 
the  one  controlling  the  two-dimensional  interpolation.  In  some 
sense  this  process  is  analogous  to  the  real  eyelid  mechanism  where 
two membranes are stretched across a spherical surface.
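
A minimal sketch of this two-step eyelid procedure is given below. It assumes that width lies along Y, height along Z and depth along X (forward), and that an opening value of 1 means fully open; none of these conventions is stated in the text, so they should be read as assumptions.

```python
import math
from typing import Tuple


def eyelid_vertex(open_yz: Tuple[float, float], closed_yz: Tuple[float, float],
                  opening: float, center: Tuple[float, float, float],
                  lid_radius: float) -> Tuple[float, float, float]:
    """Interpolate a lid vertex in 2-D, then project it onto the lid sphere.

    opening = 1 uses the open position, opening = 0 the closed position.
    """
    # Step 1: two-dimensional interpolation (width, height).
    y = opening * open_yz[0] + (1.0 - opening) * closed_yz[0]
    z = opening * open_yz[1] + (1.0 - opening) * closed_yz[1]
    # Step 2: recover the third dimension from the sphere equation.
    dy, dz = y - center[1], z - center[2]
    depth_sq = lid_radius ** 2 - dy * dy - dz * dz
    if depth_sq < 0.0:
        raise ValueError("vertex lies outside the silhouette of the sphere")
    return (center[0] + math.sqrt(depth_sq), y, z)
```

A single opening value drives every vertex inside the dashed region, so one parameter is enough to open or close the lid.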

3.2  The  Eyebrows 

The  eyebrows  are  important  facial  features  for  expressing 
emotion and emphasis. The dynamic properties of the eyebrows are
... interpolating between these values. An additional parameter allows
horizontal  translation  of  the  inside  end  of  the  eyebrow.  This 
parameter  varies  the  separation  of  the  eyebrows  across  the  bridge  of 
the  nose. 

A table  of  parameters  affecting  the  eyes  and  eye  region  of  the 
face  is  given  in  Figure  3.9.  Figure  3.10  illustrates  the  effect  of 
varying  two  of  these  parameters:  eyelid  opening  and  eyebrow  arch. 


Parameter               Value Range

Eyebrow Arch            0 ≤ value ≤ 1
Eyebrow Separation      -25 ≤ value ≤ 25
Eyelid Opening          0 ≤ value ≤ 1
Eyeball Size            50 ≤ value ≤ 100
Iris Size               0 ≤ value ≤ 1    (fraction of eyeball size)
Reflection Spot Size    0 ≤ value ≤ 1    (fraction of eyeball size)
Iris Fringe Size        0 ≤ value ≤ 1    (fraction of iris size)
Pupil Size              0 ≤ value ≤ 1    (fraction of iris size)
Iris Color              any valid color
Iris Fringe Color       any valid color
Reflection Spot Color   any valid color



Figure  3.9  - Parameters  affecting  the  eye  region. 


Figure 3.10 - Four facial images illustrating the
effect of two eye parameters: eyebrow arch and eyelid opening.

3.3 The Mouth and Jaw

To  achieve  a more  realistic  mouth,  several  additions  and 
changes were made to the basic model. Teeth, modeled as a set of 32
4-sided polygons, were added. An additional row of polygons in both
the upper and lower lip allows better curvature of the lips. An
additional  point  at  the  corner  of  the  mouth  allows  lip  thickness  at 
the  corner. 

Jaw  rotation  is  necessary  for  the  mouth  to  assume  its  various 
speech  and  expression  positions.  Jaw  rotation  is  modeled  by 
introducing  an  axis  of  rotation  and  then  rotating  the  vertices  of 
the  jaw  about  this  axis.  The  vertices  between  the  dashed  lines  in 
Figure  3.11  are  the  vertices  affected  by  jaw  rotation.  The  axis  of 
rotation passes through the point indicated and is parallel to the Y
axis. Note that the lower lip, lower teeth and corner of the mouth
rotate  with  the  jaw.  Positive  jaw  rotation  has  the  effect  of 
opening  the  mouth. 

Initially,  all  points  of  the  lower  lip  rotated  with  the  jaw 
while  the  corner  of  the  mouth  rotated  by  an  angle  one-half  that  of 
the  jaw.  This  scheme  gives  a rather  square  looking  lower  lip  when 
the  mouth  is  open.  Rotating  the  center  lower  lip  points  with  the 
jaw  and  gradually  tapering  the  rotation  angle  of  the  other  lower  lip 
points  gives  a much  better  lower  lip.  Vertices  farther  from  the 
center  of  the  lip  are  rotated  by  smaller  angles.  The  corner  of  the 
mouth  rotates  by  one-third  the  jaw  rotation.  This  improvement, 
illustrated  in  Figure  3.12,  gives  a more  natural  oval-looking  mouth. 
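
The tapered jaw rotation can be sketched as below. Several details are assumptions made only for illustration: the axes (X forward, Y across the face, Z up), the rotation axis taken parallel to the Y axis, the angle measured in degrees, and a linear taper between the center of the lip and the corner. Only the end points of the taper (full rotation at the center, one-third at the corner) come from the text.

```python
import math
from typing import Tuple

Point = Tuple[float, float, float]


def lip_weight(t: float, corner_weight: float = 1.0 / 3.0) -> float:
    """Fraction of the jaw angle applied at lip position t.

    t = 0 is the center of the lower lip and t = 1 the corner of the mouth.
    The linear falloff in between is an assumption.
    """
    return 1.0 - t * (1.0 - corner_weight)


def rotate_with_jaw(vertex: Point, pivot: Point, jaw_degrees: float,
                    weight: float = 1.0) -> Point:
    """Rotate a vertex about an axis through `pivot` parallel to the Y axis.

    With the assumed axes, a positive angle moves points in front of the
    pivot downward, which opens the mouth.
    """
    angle = math.radians(jaw_degrees * weight)
    c, s = math.cos(angle), math.sin(angle)
    dx, dz = vertex[0] - pivot[0], vertex[2] - pivot[2]
    return (pivot[0] + dx * c + dz * s, vertex[1], pivot[2] - dx * s + dz * c)
```

Jaw vertices would use `weight=1.0`, while each lower lip vertex would pass `lip_weight(t)` for its position along the lip.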


Figure  3.11  - The  topology  used  for  the  parametric 
model.  The  jaw  rotation  axis  is  indicated.  The  dashed 
lines enclose the vertices affected by jaw rotation.




Figure  3.12  - Two  faces  illustrating  the  improvement 
in  mouth  shape  achieved  by  tapering  the  effect  of  jaw 
rotation  on  the  lips.  The  original  version  is  shown  in  (a) 
and the improved version is shown in (b).

In a dynamic model we certainly want the mouth to vary in
expression.  A step  in  this  direction  is  to  maintain  data  for  two 
expressions.  The  mouth  can  then  be  interpolated  between  two 
expression  extremes  such  as  "smile"  and  "neutral." 

The model has several additional mouth manipulation parameters.
A scaling factor controls the width of the mouth. A translation
parameter allows the lips to be moved away from the front teeth.
The  thickness  of  the  lips  at  the  corner  of  the  mouth  may  be  varied. 
Three  translation  parameters  allow  the  corner  of  the  mouth  to  move 
in  all  three  dimensions.  And  a translation  parameter  allows  the 
lower  lip  to  be  tucked  up  under  the  upper  front  teeth.  This 
position  is  assumed  by  the  mouth  in  forming  the  sounds  "f"  and  "v". 

Later,  after  experimenting  with  speech  animation,  another  mouth 
parameter  was  added.  This  translation  parameter  allows  the  upper 
lip  to  be  raised  and  lowered.  The  effect  of  this  parameter  is 
tapered from the center of the lip to the corner of the mouth. The
center  vertices  receive  full  effect  while  the  corner  vertices  are 
not  affected.  This  tapering  gives  a more  natural,  rounded 
appearance  to  the  upper  lip. 

A table  of  parameters  affecting  the  mouth  region  along  with 
their  value  ranges  is  shown  in  Figure  3.13.  Figure  3.14  illustrates 
the  effects  obtained  using  three  of  these  parameters:  jaw  rotation, 
upper lip position and mouth expression.

Parameter                                   Value Range

Jaw Rotation                                0 ≤ value ≤ 20
Mouth Width                                 .5 ≤ value ≤ 1.5
Mouth Expression                            0 ≤ value ≤ 1
Lip offset away from the teeth              0 ≤ value ≤ 30
Width of the corner of the mouth            0 ≤ value ≤ 25
X, Y and Z displacement of the
  corner of the mouth                       -25 ≤ value ≤ 25
"f" and "v" tuck                            -20 ≤ value ≤ 0
Upper lip position                          0 ≤ value ≤ 20

Figure  3.13  - Parameters  affecting  the  mouth  region. 


3.4  Conformation  Parameters 

Another group of manipulation parameters was added to allow the
model  to  change  in  conformation.  Conformation  is  used  here  to  mean 
those  features  of  the  face  that  change  or  vary  from  one  individual 
to  another  as  opposed  to  features  that  vary  from  expression  to 
expression.  Some  features  change  from  expression  to  expression  as 
well  as  from  person  to  person.  Here  again,  these  conformation 
parameters are implemented by means of interpolation, scaling or
translation.

Figure 3.14 - A set of 8 faces illustrating
the effect of three mouth parameters: jaw rotation,
upper lip position and mouth expression.

3.4.1 Parameters Implemented by Interpolation

The  conformation  of  several  facial  regions  is  controlled  by 
interpolation.  The  forehead  may  vary  from  sloping  to  bulging.  The 
cheekbone  can  range  from  not  noticeable  to  very  pronounced.  The 
hollow of the cheek can vary from convex to concave. The shapes of
the chin and neck are also changed using interpolation.

3.4.2  Parameters  Implemented  using  Scaling 

A set  of  scaling  parameters  controls  the  scaling  of  the  entire 
face  in  each  of  the  three  dimensions.  Combinations  of  these 
parameters  determine  the  aspect  ratio  of  the  face. 

Two  scaling  parameters  affect  the  shape  of  the  eyelids.  One 
parameter  controls  the  width  of  the  eyelid  while  the  other  controls 
its  height.  Varying  the  values  of  these  parameters  controls  the 
shape  and  relative  size  of  the  eyelids. 

The  nose  is  affected  by  two  scaling  parameters.  One  controls 
the  width  of  the  bridge  of  the  nose.  The  other  determines  the  width 
of  the  lower  portion  of  the  nose  including  the  nostrils. 

Three  scale  factors  control  the  vertical  proportions  of  the 
face.  One  value  controls  the  scaling  of  the  area  from  the  chin  to 
the  mouth.  Another  value  controls  the  area  from  the  chin  to  the 
eyes. The third value controls the scaling of the region above the
eyes.

In addition there are scaling parameters affecting the width of
the cheek area and the width of the jaw. The effect of the jaw
scaling  is  tapered.  The  maximum  effect  is  applied  to  the  forward 
portion  or  point  of  the  jaw.  The  scaling  effect  diminishes  to  zero 
at  the  rear  of  the  jaw. 
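
One way to picture the tapered jaw scaling is the sketch below. It assumes that X points forward, that the face is centered on Y = 0 so that width scaling multiplies the Y coordinate, and that the taper is linear between the rear and the front of the jaw. The text gives only the end points of the taper, so the linear form and the names `x_rear` and `x_front` are hypothetical.

```python
from typing import Tuple

Point = Tuple[float, float, float]


def scale_jaw_width(vertex: Point, jaw_scale: float,
                    x_rear: float, x_front: float) -> Point:
    """Scale the width of a jaw vertex, fully at the front and not at the rear."""
    # t runs from 0 at the rear of the jaw to 1 at the forward point.
    t = (vertex[0] - x_rear) / (x_front - x_rear)
    t = max(0.0, min(1.0, t))
    # Blend between no scaling (1.0) and the requested scale factor.
    effective = 1.0 + (jaw_scale - 1.0) * t
    return (vertex[0], vertex[1] * effective, vertex[2])
```

With `jaw_scale = 0.8`, a vertex at the point of the jaw has its width reduced to 80 percent while a vertex at the rear of the jaw is left where it was.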

3.4.3  Parameters  Implemented  using  Translation 

Several  parameters  allow  regions  of  the  face  to  be  displaced. 
It  is  possible  to  move  the  chin  both  up  and  down  as  well  as  forward 
and  backward  in  relation  to  the  rest  of  the  face.  Likewise  the 
lower  portion  of  the  nose  can  be  moved  forward,  backward,  up  or 
down.  And  finally  the  eyebrows  may  be  displaced  up  or  down. 

A table  of  conformation  parameters  and  their  value  ranges  is 
given in Figure 3.15. Figure 3.16 illustrates the effect of some
conformation  parameters. 

3.5  Other  Parameters 

There  are  a number  of  other  parameters  which  affect  either  the 
face or its environment. These include the position of the eyeballs
within  the  head,  where  the  eyes  are  looking,  where  the  light  source 
is  located,  where  the  viewer  is  located  and  where  he  is  looking. 
Additional  parameters  control  the  field  of  view,  the  color  (grey 
level)  of  the  various  facial  regions,  the  background  grey  level  and 
the  cosine  power  and  diffuse  factor  used  in  shading  the  face. 


Parameter                                   Value Range

Interpolation                               0 ≤ value ≤ 1
  Forehead
  Cheekbone
  Cheek hollow
  Chin shape
  Neck shape

Scaling                                     .5 ≤ value ≤ 1.5
  Chin-to-mouth
  Chin-to-eye
  Eye-to-forehead
  Eyelid X and Z size
  Head X, Y and Z scale
  Jaw width
  Cheek width
  Bridge of the nose
  End of the nose

Translation                                 -50 ≤ value ≤ 50
  X and Z offset for chin
  X and Z offset for the end of the nose
  Z offset for the eyebrows



Figure  3.15  - The  conformation  parameters. 


Figure  3.16  - Effects  achieved  using  some  of  the 
conformation  parameters.  The  initial  face  is  shown  in  (a). 
In  (b)  the  forehead  has  been  interpolated  to  a different 
shape.  For  (c)  the  neck  was  interpolated  to  a new  shape. 
The jaw was scaled by a factor of .8 for (d).

Figure 3.16 (continued) - For (e) the chin-to-mouth scale
was set to .85. For (f) the chin-to-mouth scale was .9 and
the chin-to-eye scale was .8. In (g) the vertical scale for
the head was increased to 1.15. In (h) the scale of the end
of the nose was changed from 1.0 to .8.

Figure 3.16 (continued) - In (i) the end of the nose was
moved down. For (j) the horizontal scale of the head was set
to .95 and the cheekbone interpolated to a slightly more
prominent shape. Finally for (k) the horizontal scale of the
head was changed to 1.1. The end of the nose was raised and
scaled by 1.1. The bridge of the nose was narrowed and the
eyebrows lowered. The cheekbones are slightly more prominent
and the cheek hollows are a little more concave.


CHAPTER 4


FACIAL  ANIMATION  USING  THE  PARAMETRIC  MODEL 


Using  the  parametric  model,  facial  animation  is  reduced  to 
varying  parameters  over  time.  The  difficult  part  of  this  task  is  to 
determine  how  the  parameters  must  vary  over  time  to  achieve  a 
desired  effect.  It  is  also  difficult  to  specify  and  coordinate  the 
usually  large  number  of  parallel  parameter  functions.  The  parallel 
specification  of  parameter  functions  is  described  below.  A subset 
of  the  first  problem,  speech  synchronized  animation,  is  discussed 
later  in  the  chapter. 


4.1  Parallel  Parameter  Function  Specification 

When using this model for animation we want to vary several
parameters simultaneously over any given time period. Since most
computers are sequential, we must introduce some mechanism which
allows us to specify these parallel functions and have them
translated into a sequential list of specifications. A program was
developed which has as input a set of parameter specifications and
has as output a sequential list of commands. A parameter
specification describes how the value of a parameter should vary
over a given time interval. The list of commands is used as input
to the program implementing the model.

The sequencing program accomplishes its task by storing the
information from each parameter specification and then searching
this data for each frame of the animated sequence. If the current
frame is within the frame range of a specification then a command is
generated to be included in the output command list. The parameter
value for a given frame is determined from the parameter
specification using the frame range, parameter value range, the
current frame number and the parameter change function specified.
The parameter change functions and the parameter value computation
are given below.


    parameter value = initial value + PCF * DIFF

where

    DIFF = final value - initial value

    PCF  = FF
           or (1 - COS(FF*π)) / 2
           or 1 - COS(FF*π/2)
           or SIN(FF*π/2)          (the parameter change functions)

    FF   = (current frame # - initial frame #) / (final frame # - initial frame #)
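
The sequencing idea can be sketched as follows. This is an illustration, not the original sequencing program. The names `ParameterSpec`, `parameter_value` and `sequence`, and the labels given to the four change functions, are all hypothetical. The second change function is read here as (1 - cos(FF·π))/2, which is an assumption about how the formula was laid out in print.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

# The four parameter change functions; the dictionary keys are made-up labels.
PCFS = {
    "linear": lambda ff: ff,
    "ease_in_out": lambda ff: (1.0 - math.cos(ff * math.pi)) / 2.0,
    "ease_in": lambda ff: 1.0 - math.cos(ff * math.pi / 2.0),
    "ease_out": lambda ff: math.sin(ff * math.pi / 2.0),
}


@dataclass
class ParameterSpec:
    name: str
    first_frame: int
    last_frame: int
    initial: float
    final: float
    pcf: str = "linear"


def parameter_value(spec: ParameterSpec, frame: int) -> float:
    """parameter value = initial value + PCF * DIFF"""
    span = spec.last_frame - spec.first_frame
    ff = 1.0 if span == 0 else (frame - spec.first_frame) / span
    return spec.initial + PCFS[spec.pcf](ff) * (spec.final - spec.initial)


def sequence(specs: List[ParameterSpec],
             n_frames: int) -> List[Tuple[int, str, float]]:
    """Turn parallel specifications into one frame-ordered command list."""
    commands = []
    for frame in range(n_frames):
        for spec in specs:  # search the stored specifications for this frame
            if spec.first_frame <= frame <= spec.last_frame:
                commands.append((frame, spec.name, parameter_value(spec, frame)))
    return commands
```

For example, `sequence([ParameterSpec("jaw rotation", 0, 10, 0.0, 20.0)], 11)` would yield one command per frame with the jaw angle rising linearly from 0 to 20.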




4.2  Speech  Synchronized  Facial  Animation 

Speech  animation  implies  the  ability  to  convey  emphasis  and 
emotion  in  addition  to  manipulating  the  mouth  and  lips.  Speech 
animation  is  achieved  by  manipulating  the  facial  parameters  so  that 
the  facial  expression  and  lip  motions  will  match  a spoken 
soundtrack. 

What is required for lip animation? Madsen in his book on
conventional  animation  [10]  indicates  the  following  capabilities  are 
required  for  lip  animation. 

a) open lips for the open vowels a, e and i

b) closed lips for the accent consonants p, b and m

c) an oval mouth for u, o and w

d) the ability to tuck the lower lip up under the upper front
teeth for the f and v

e) and the ability to move between these lip positions as
required.


The  remaining  sounds  are  formed  mainly  by  the  tongue  and  do  not 
require precise animation. Madsen also indicates that realistic
characters require more care in lip animation than abstract or
non-realistic characters.

A number of the parameters detailed in the previous chapter are
used to give the required lip animation capability. These
parameters are jaw rotation, width of the mouth, displacement away
from the teeth, translation of the lower lip up under the upper
front teeth and manipulation of the corner of the mouth.
Manipulation of the eyebrows, eyelids and the expression of the
mouth are used in conjunction with these parameters to convey
emphasis and emotion.

4.3  The  Production  of  Speech  Animated  Segments 

The first step in producing speech animated segments was to
obtain a soundtrack. Since we wished to compare the computer
produced animation with the real speech, we filmed an actor reciting
a poem. He was filmed using what is known as "double system". His
image was recorded on film while his voice was recorded on magnetic
tape. The end product of this procedure is a piece of movie film
and a piece of sprocketed magnetic film. These two pieces are
synchronized so that the sound corresponds frame for frame with the
images on the movie film.

The next step was to analyze (search) the magnetic film for the
various parts of speech, i.e. open vowels, accent consonants, etc.
The end product of this process is a frame by frame chart of the
speech contained on the film.

A total of six computer generated speech segments were produced
for use with this soundtrack. The first was produced in the
following manner. Using the speech vs frame number chart and a
mirror, I went through the poem mouthing the words. I watched my
mouth motions in the mirror and translated them into parameter
functions used to drive the model. Graphs of the parameter
functions used for this and the other segments are included in
Appendix B. The parameters used for the mouth in this segment were
jaw rotation, mouth expression, translation away from the teeth,
mouth width, the f, v tuck and manipulation of the corner of the
mouth. In addition there was some manipulation of the eyelids and
the eyebrows.

The sequence generated was quite disappointing. For one thing,
a fundamental problem with speech animation was encountered. Often
the lip motions vary over very short time periods. They can move at
about the same rate as the film frame rate. This means that a
severe sampling problem occurs. How do you represent a motion that
lasts only a frame or two? This segment was also over animated.
This gave the impression that the lips were moving about twice as
fast as the speech even though the accent consonants were properly
synchronized.

For the second segment all lip or mouth motion was deleted
except jaw rotation. The jaw rotation was "smoothed" out. By
comparing the jaw rotation graphs for sequences 1 and 2, this
smoothing can be seen. The resulting animated segment was again
disappointing. The goal of simplifying and slowing the lip motions
was achieved but the motion was not natural. The motion did not
seem to match the speech very well even though the accent consonants
were in proper synchronization.

For the third sequence, "f" and "v" tucks were added. This
made essentially no difference in the results. The "f" and "v"
tucks occur over such short intervals that they are not visible.

For  the  fourth  try,  the  width  of  the  mouth  was  also  varied. 
This  gave  some  improvement,  but  the  animation  was  still  not  right. 

At this point, confidence in the model was somewhat diminished.
Was the poor animation due to the parameter functions driving the
model or was there a basic deficiency in the model itself? To answer
this question, I decided to analyze, frame by frame, the film of
the actor speaking. The results of this analysis would then be used
to determine the parameter functions used to drive the model.

During  the  course  of  the  analysis  it  became  apparent  that  the 
model  lacked  an  important  capability.  When  the  actor  spoke  he  used 
his  upper  lip.  The  model,  however,  did  not  allow  the  upper  lip  to 
move.  The  model  was  modified  to  include  this  capability. 

A fifth  try  at  the  speech  animation  was  made  using  the  improved 
model  and  the  results  of  the  frame  by  frame  analysis.  In  this 
sequence  the  lip  animation  was  much  more  convincing. 

For  the  sixth  try,  the  film  was  again  analyzed  frame  by  frame. 
This  second,  independent  analysis  was  used  to  drive  the  model  for 
this  final  sequence.  Much  more  attention  was  given  to  the  eyes, 
eyebrows  and  mouth  expression  in  an  effort  to  get  more  emotion  into 
the animation. This final segment is quite convincing. Most
viewers agree that it is at least on a par with most
conventional speech animation.


CHAPTER  5 


CONCLUSIONS  AND  FUTURE  RESEARCH 

5.1  Conclusions

The  parametric  model  developed  during  this  research  is 
certainly  not  "the"  model  for  faces  but  it  is  a viable  model  and  a 
good  starting  point  for  future  research. 

With this model, fewer than 10 parameters are needed to do a
reasonable job of speech animation. The parameters found most
effective for facial expression and lip animation are listed in
Figure 5.1.


LIP ANIMATION PARAMETERS

Jaw rotation
Upper lip position
Mouth width

EXPRESSION PARAMETERS

Mouth expression
Eyebrow arch
Eyebrow separation
Eyelid opening
Pupil size
Eye tracking


Figure  5.1  - The  parameters  needed  for  speech  animation. 


Speech animation for a model with this level of realism is
difficult. The faces are real enough that the viewer expects them
to behave realistically. The simplified techniques used in most
character speech animation are not good enough. The "f" and "v"
tucks are particularly useless.

The  conformation  parameters  of  the  model  are  not  as  easily 
evaluated  as  the  speech  and  expression  parameters.  These  parameters 
do  allow  the  conformation  of  the  face  to  change.  But,  it  is  not 
clear exactly what conformation parameters are desirable. A number
of additional conformation controls might be tried, including the
shape of the nose, nostril size and position, lip shape and
thickness, eyebrow shape and thickness, relative size and position
of the face with relation to the head, and the size and shape of
the teeth. Experience with a
model  having  a wide  range  of  conformation  control  would  be  necessary 
to  determine  the  best  set  of  conformation  parameters. 

One  deficiency  of  the  current  model  is  that  it  is  symmetric. 
Since real faces are not symmetric, the ability to handle
non-symmetric conformation and expressions would be a significant
improvement. Another deficiency is that the face does not pivot on
the  neck.  When  people  speak,  they  tend  to  pivot  their  heads.  The 
ability  to  pivot  the  head  would  greatly  improve  the  realism  of  the 
speech  animation. 

Obviously  the  realism  of  the  images  would  be  greatly  improved 
if  the  entire  head  were  modeled  rather  than  just  the  face  and  neck. 
The  addition  of  ears,  hair  and  maybe  a tongue  would  enhance  the 
model. The addition of hair presents a problem, however. How is
hair represented with polygonal surfaces?

5.2  Future  Research 

Efficient  techniques  are  being  developed  which  use  surface 
patches  rather  than  polygons  as  a basis  for  creating  shaded  images. 
The  development  of  parametric  models  based  on  surface  patches  seems 
to  be  a promising  area  of  research. 

Automatic speech synchronized animation seems within reach. A
marriage of the current speech recognition technology and the speech
animation capability demonstrated by this research could lead to
major results.

In  a sense  the  parametric  facial  model  is  an  instrument  we  do 
not yet know how to play. It initiates a new area that might be
labeled  "computer  acting."  A new  set  of  skills  and  intuitions  is 
needed.  These  would  be  analogous  to  the  skills  and  intuitions  a 
conventional  animator  develops  allowing  him  to  work  effectively  with 
his  media. 

The  facial  model  might  be  useful  to  psychologists  interested  in 
expression  and  non-verbal  communication.  The  model  allows 
separation  of  expressions  and  actions  not  normally  seen 
independently  in  real  people. 

The  development  of  parametric  graphical  models  for  classes  of 
objects  is  a very  open-ended  area  of  study.  The  development  of  this 
model for faces is just an initial step into this area.


APPENDIX A


THE EYE TRACKING AND EYE
REFLECTION SPOT ALGORITHMS

The  facial  model  has  the  capability  for  the  eyes  to  look  at  a 
specified  position.  In  addition,  a reflection  of  the  light  source 
is  visible  on  the  surface  of  each  eye.  This  appendix  describes  the 
algorithms  used  to  orient  the  eyeballs  and  to  place  the  reflection 
spots  on  the  eyes. 

The  first  step  in  this  process  is  to  determine  the  orientation 
angles  for  each  eyeball.  The  eyeball  generated  by  the  eyeball 
procedure  is  centered  at  the  origin  of  its  coordinate  system  with 
the  X axis  passing  through  the  center  of  the  pupil.  The  orientation 
angles  desired  are  those  that  will  rotate  the  eyeball  so  that  it 
will  be  looking  in  the  desired  direction  when  it  is  positioned  in 
the  face. 

Referring to Figure A.1 we see that each eye has two orientation
angles, $\alpha$ and $\beta$, associated with it. The $\alpha$ and
$\beta$ for each eye are computed independently. The following
equations are used to determine these angles.

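$$\alpha_R = \arctan\left(\frac{Y_T - Y_R}{X_T - X_R}\right)$$

$$\alpha_L = \arctan\left(\frac{Y_T - Y_L}{X_T - X_L}\right)$$

$$\beta_R = \arctan\left(\frac{Z_R - Z_T}{L_R}\right)$$

$$\beta_L = \arctan\left(\frac{Z_L - Z_T}{L_L}\right)$$

$$L_R = \left((Y_T - Y_R)^2 + (X_T - X_R)^2\right)^{1/2}$$

$$L_L = \left((Y_T - Y_L)^2 + (X_T - X_L)^2\right)^{1/2}$$

[Figure A.1 - diagram labels: side view, target]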
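A minimal sketch of the orientation-angle computation for one eye follows. The function name and the tuple layout are assumptions made for illustration. The sketch uses a plain arctangent, as the equations do, and therefore assumes the target lies in front of the eye so that the X difference is non-zero.

```python
import math

def eye_orientation(eye, target):
    """Return (alpha, beta) in radians for an eye centred at `eye`
    looking at `target`. Both arguments are (x, y, z) tuples.

    Assumes target x differs from eye x (target in front of the face).
    """
    xe, ye, ze = eye
    xt, yt, zt = target
    alpha = math.atan((yt - ye) / (xt - xe))
    dist = math.hypot(yt - ye, xt - xe)   # L_R or L_L in the equations
    beta = math.atan((ze - zt) / dist)
    return alpha, beta
```

The function is called once per eye with that eye's own centre position, which is what makes the two pairs of angles independent.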


The next step is to determine the position of the center of the
lens sphere in the eyeball coordinate system. Referring to the upper
left diagram in Figure A.2 we see that the center of the lens is
displaced a distance $L_c$ along the X axis of the eyeball. Assuming
an eyeball of unit radius with the iris radius $R_i$ specified as a
fraction of this radius then the distance $L_c$ is computed as follows.


$$1 = R_i^2 + (R_i + L_c)^2$$

$$1 - R_i^2 = (R_i + L_c)^2$$

$$L_c = (1 - R_i^2)^{1/2} - R_i$$

To find the true magnitude of $L_c$ we need to multiply the equation
above by the radius $R_e$ of the eyeball.

$$L_c = \left((1 - R_i^2)^{1/2} - R_i\right) R_e$$


Now what happens to the center of the lens as the eyeball is
rotated by $\alpha$ and $\beta$ into its proper orientation? Again
referring to Figure A.2 we see that the center of the lens is now
located at some position $X_\ell$, $Y_\ell$, $Z_\ell$ in the eyeball
coordinate system.

$$Z_\ell = L_c \sin(-\beta)$$

$$Y_\ell = L_c \cos(-\beta) \sin(\alpha)$$

$$X_\ell = L_c \cos(-\beta) \cos(\alpha)$$
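The lens-center computation can be sketched as follows. The function names and the default `eyeball_radius` argument are hypothetical; the arithmetic follows the expressions for the lens displacement and for the rotated lens position given above.

```python
import math

def lens_offset(iris_radius, eyeball_radius=1.0):
    """Distance of the lens-sphere centre along the eyeball X axis.

    `iris_radius` is the iris radius as a fraction of the eyeball
    radius (between 0 and 1).
    """
    return (math.sqrt(1.0 - iris_radius ** 2) - iris_radius) * eyeball_radius

def lens_center(l_c, alpha, beta):
    """Lens centre (x, y, z) after the eye is rotated by alpha and beta."""
    x = l_c * math.cos(-beta) * math.cos(alpha)
    y = l_c * math.cos(-beta) * math.sin(alpha)
    z = l_c * math.sin(-beta)
    return x, y, z
```

Because the three components are a spherical-coordinate expansion, the returned point always lies at distance `l_c` from the eyeball center, whatever the angles.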

The next step is to determine the reflection spot orientation
angles $\theta$ and $\phi$ for each eye. When the reflection spot
polygon is generated by the eyeball procedure it is centered on and
perpendicular to the X axis of the eye. It is a distance R away
from the center of the eye along the X axis. The angles $\theta$ and
$\phi$ are used to rotate the spot polygon about the center of the
eyeball into its proper orientation. The spot will then be displaced
along the X axis a distance $L_c$. These rotations and displacement
will place the reflection spot at the correct position on the
surface of the eye lens. Referring to Figure A.3, the angles $\theta$
and $\phi$ are computed as follows.

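$$\gamma_{Lgt} = \arctan\left(\frac{y_{Lgt} - y_{Lc}}{x_{Lgt} - x_{Lc}}\right)$$

$$\gamma_v = \arctan\left(\frac{y_v - y_{Lc}}{x_v - x_{Lc}}\right)$$

$$\gamma = (\gamma_{Lgt} + \gamma_v)/2$$

$$\theta = \gamma - \alpha$$

$$L_{Lgt} = \left((y_{Lgt} - y_{Lc})^2 + (x_{Lgt} - x_{Lc})^2\right)^{1/2}$$

$$L_v = \left((y_v - y_{Lc})^2 + (x_v - x_{Lc})^2\right)^{1/2}$$

$$\delta_{Lgt} = \arctan\left(\frac{z_{Lc} - z_{Lgt}}{L_{Lgt}}\right)$$

$$\delta_v = \arctan\left(\frac{z_{Lc} - z_v}{L_v}\right)$$

$$\delta = (\delta_{Lgt} + \delta_v)/2$$

$$\phi = \delta - \beta$$

Figure A.3 - Reflection spot orientation angles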


Note that in Figures A.1, A.2 and A.3 all angles are positive in the
counter-clockwise  sense. 
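A hedged sketch of the reflection spot angle computation is given below. It assumes that each spot angle is the average of the directions from the lens center to the light source and to the viewer, less the corresponding eye orientation angle. The function and argument names are hypothetical, and the plain arctangent assumes the light and viewer both lie in front of the lens (non-zero X difference).

```python
import math

def spot_angles(lens_c, light, viewer, alpha, beta):
    """Return (theta, phi) in radians for the reflection spot on one eye.

    lens_c, light and viewer are (x, y, z) positions of the lens centre,
    the light source and the viewer; alpha and beta are that eye's
    orientation angles.
    """
    def direction(p):
        dx = p[0] - lens_c[0]
        dy = p[1] - lens_c[1]
        horizontal = math.atan(dy / dx)
        vertical = math.atan((lens_c[2] - p[2]) / math.hypot(dx, dy))
        return horizontal, vertical

    g_light, d_light = direction(light)
    g_view, d_view = direction(viewer)
    theta = (g_light + g_view) / 2.0 - alpha
    phi = (d_light + d_view) / 2.0 - beta
    return theta, phi
```

Averaging the two directions places the spot where a mirror-like surface would reflect the light toward the viewer, which is the usual reason for bisecting the light and view directions.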

Having determined all the orientation angles and displacements,
how are they applied? For each eye the order of application is as
follows. First the reflection spot polygon is rotated by $\theta$ and
then by $\phi$ about the center of the eyeball. It is then displaced
along the X axis of the eyeball by a distance $L_c$. Expressed in
terms of transformation matrices this is

$$[T_{spot}] = [T_\theta][T_\phi][T_L]$$

Next the reflection spot and the rest of the eye are rotated about
the center of the eyeball first by $\alpha$ and then by $\beta$.

$$[T_{orient}] = [T_\alpha][T_\beta]$$

And finally the entire eye is displaced by x, y, z to assume its
proper position in the face.

$$[T_{eye}] = [T_{orient}][T_{face}]$$

The reflection spot is transformed first by $[T_{spot}]$ and then by
$[T_{eye}]$. The rest of the eyeball is transformed only by $[T_{eye}]$.


APPENDIX  B 


THE  PARAMETER  VALUES  USED  FOR  THE 
SPEECH  SYNCHRONIZED  ANIMATED  SEGMENTS 

This appendix contains graphs of the parameter values used to
generate six computer animated segments (see Appendix D). These
segments were animated to match a spoken soundtrack. The soundtrack
is the reading of a short poem. The poem, "Little Stone" by Emily
Dickinson, is listed below.


How happy is the little stone
That rambles in the road alone.
And never cares about careers
And exigencies never fears;
Whose coat of elemental brown
A passing universe put on;
And independent as the sun.
Associates or glows alone.
Fulfilling absolute decree
In casual simplicity.

In the following figures the vertical axes correspond to the
parameter values while the horizontal axes represent time expressed
in terms of frame numbers. Also indicated on each graph is the
phonetic representation [11] of the soundtrack.

The actual data values used in specifying the parameter
functions are indicated as dots. The intermediate values were
generated using the parameter change functions discussed in
Chapter 4.
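The way intermediate values follow from the plotted dots can be sketched as below. The key values in the usage note are made up for illustration, the function names are hypothetical, and the sketch applies a single parameter change function to every interval, which is a simplifying assumption.

```python
import math
from bisect import bisect_right

def ease(ff):
    """One of the parameter change functions: (1 - cos(FF*pi)) / 2."""
    return (1.0 - math.cos(ff * math.pi)) / 2.0

def sample(keys, frame, pcf=ease):
    """Parameter value at `frame`.

    `keys` is a list of (frame_number, value) dots sorted by frame number.
    Frames outside the keyed range hold the nearest key value.
    """
    frames = [f for f, _ in keys]
    if frame <= frames[0]:
        return keys[0][1]
    if frame >= frames[-1]:
        return keys[-1][1]
    i = bisect_right(frames, frame) - 1      # interval containing `frame`
    (f0, v0), (f1, v1) = keys[i], keys[i + 1]
    ff = (frame - f0) / (f1 - f0)
    return v0 + pcf(ff) * (v1 - v0)
```

For example, with the hypothetical dots `[(0, 0.0), (10, 1.0), (20, 0.5)]`, `sample(keys, 15)` evaluates the second interval halfway through and returns a value midway between 1.0 and 0.5.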


Figure B.1 - The values of the eyelid opening
parameter for all six sequences. A value of zero
corresponds to the eyelid being closed, a value of
one corresponds to the eyelid being wide open.


Figure B.2 - The values of the eyebrow arch parameter
for sequences 1 through 5. A value of 0 corresponds to fully
arched eyebrows. A value of 1 corresponds to minimum eyebrow
arch. In frames 320 to 639 the parameter value is .65.

Figure B.4 - The jaw rotation values used for sequence 1.

Figure B.5 (continued)

Figure B.7 - The jaw rotation values used for sequence 5.

Figure B.7 (continued)


Figure B.10 - The scale factors used for mouth width in sequence 4.


Figure B.13 - The expression parameter values used in
sequence 1. A value of 1 is neutral and a value of 0 is a
smile. The parameter value is .7 for frames 320 to 639.


Figure B.14 - The expression parameter values
used in sequences 2 through 5. For sequence 2 the
parameter value is .7 in frames 320 to 639. A value
of 1 is neutral and a value of 0 is a smile.

Figure B.16 - The f and v tuck parameter values
used in sequence 1. A value of 0 corresponds to no
tuck and a value of -30 corresponds to a full tuck.

Figure B.17 - The f,v tuck parameter values used
in sequences 3 and 4. A value of 0 corresponds to no
tuck and a value of -20 corresponds to a full tuck.

Figure B.18 - The upper lip parameter values used in
sequence 5. A value of 20 corresponds to a fully raised
lip while a value of 0 means the lip is not raised at all.

Figure B.18 (continued) - In frames 508 to
516 the solid line is the actual function used.
The dashed line indicates the intended function.

Figure B.19 - The upper lip parameter values used in
sequence 6. A value of 20 corresponds to a fully raised
lip while a value of 0 means the lip is not raised at all.


APPENDIX C


MEASURING  THREE-DIMENSIONAL  SURFACES 
WITH  A TWO-DIMENSIONAL  DATA  TABLET 


The measurement of point positions on a three-dimensional
surface is a difficult task. This is particularly true if one does
not have access to any of the specialized equipment used for this
purpose and especially if the surface is flexible and varies over
time. Outlined below is a technique that allows one to accomplish
this task armed only with standard photographic techniques, a
two-dimensional data tablet or digitizer and some computing power.
The tablet or digitizer is not absolutely essential but is necessary
for the method to be practical.


In essence the technique consists of photographing the surface
of interest from several positions (simultaneously in the case of
surfaces that vary over time) and then using the tablet to [...]
information from these multiple views of the surface. This
information is then processed to determine the three-dimensional
position of points on the surface.


C.1  Theory

A camera can be viewed as simply [...] three-dimensional space
into a two-dimensional [...] can be described mathematically [...]
techniques [12,13] and the proper mapping or transformation matrix.


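$$\begin{bmatrix} x & y & z & 1 \end{bmatrix}
\begin{bmatrix}
T_{11} & T_{12} & T_{13} \\
T_{21} & T_{22} & T_{23} \\
T_{31} & T_{32} & T_{33} \\
T_{41} & T_{42} & T_{43}
\end{bmatrix}
= \begin{bmatrix} x' & y' & w \end{bmatrix}
= w \begin{bmatrix} u & v & 1 \end{bmatrix} \qquad (1)$$

where $u = x'/w$ and $v = y'/w$ are the coordinate values in the
two-dimensional space of the photographs and x, y and z are the
coordinate values in the three-dimensional space. Carrying out the
indicated multiplication gives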


$$\begin{aligned}
T_{11} x + T_{21} y + T_{31} z + T_{41} &= x' = w u \\
T_{12} x + T_{22} y + T_{32} z + T_{42} &= y' = w v \\
T_{13} x + T_{23} y + T_{33} z + T_{43} &= w
\end{aligned} \qquad (2)$$

Substituting  the  expression  for  w into  the  first  two  equations  above 
and  collecting  terms  gives 


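$$\begin{aligned}
(T_{11} - T_{13} u) x + (T_{21} - T_{23} u) y + (T_{31} - T_{33} u) z + (T_{41} - T_{43} u) &= 0 \\
(T_{12} - T_{13} v) x + (T_{22} - T_{23} v) y + (T_{32} - T_{33} v) z + (T_{42} - T_{43} v) &= 0
\end{aligned} \qquad (3)$$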

If the three-space coordinates x, y, z and the transformation
matrix  T are  known  then  these  equations  give  the  values  of  u and  v, 
the  two-dimensional  coordinates  in  the  photograph. 
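The forward mapping from a three-space point to photograph coordinates can be sketched as follows. The function name and the list-of-rows layout for T are assumptions made for illustration.

```python
def project(T, point):
    """Map a 3-D point to photograph coordinates (u, v).

    T is a 4x3 matrix given as four rows [Tk1, Tk2, Tk3]. The point is
    treated as the row vector [x, y, z, 1], matching equation (1).
    """
    x, y, z = point
    row = (x, y, z, 1.0)
    # Row vector times matrix: one sum per column of T.
    xp, yp, w = (sum(row[k] * T[k][j] for k in range(4)) for j in range(3))
    return xp / w, yp / w
```

With the hypothetical matrix `[[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 0, 1]]`, which gives w = z + 1, the point (2, 4, 1) maps to u = 1 and v = 2.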

If  the  transformation  matrix  T and  the  two-space  coordinates  u 
and  v are  known  then  the  equations  above  have  the  form 


$$a x + b y + c z + d = 0 \qquad (4)$$


the equation of a plane in three-space. Now, if we have for the
same point a u,v coordinate pair and the T matrix from another
photographic view we will have four equations in the three unknowns
x, y and z. In general if we have n views with n T matrices and n
u,v coordinate pairs we will have 2n equations in the three
unknowns.


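$$\begin{bmatrix}
a_1 & b_1 & c_1 \\
a_2 & b_2 & c_2 \\
\vdots & \vdots & \vdots \\
a_{2n} & b_{2n} & c_{2n}
\end{bmatrix}
\begin{bmatrix} x \\ y \\ z \end{bmatrix}
=
\begin{bmatrix} -d_1 \\ -d_2 \\ \vdots \\ -d_{2n} \end{bmatrix} \qquad (5)$$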

If a solution exists for this overdetermined system it can be
computed in the following manner. Multiplying both sides of
equation (5) by $A^T$ gives

$$[A^T A][X] = [A^T D] \qquad (6)$$

This system of three equations when solved by standard techniques
gives a least-mean-square solution for x, y and z. So, if we have
the u,v coordinates of a point in at least two different
photographic views and the corresponding transformation matrices for
the views, it is possible to solve for the point's position in
three-space.
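A minimal sketch of this solution in plain Python follows. The names `solve` and `triangulate` are hypothetical, each T is assumed to be stored as four rows of three values, and the constant term of each plane equation is moved to the right-hand side before the normal equations are formed.

```python
def solve(M, rhs):
    """Solve M x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(M[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x

def triangulate(views):
    """Least-squares [x, y, z] from a list of (T, u, v), one per photograph."""
    A, D = [], []
    for T, u, v in views:
        for j, m in ((0, u), (1, v)):        # the two plane equations of (3)
            coef = [T[k][j] - T[k][2] * m for k in range(4)]
            A.append(coef[:3])               # a, b, c
            D.append(-coef[3])               # constant term moved to the right
    AtA = [[sum(r[i] * r[j] for r in A) for j in range(3)] for i in range(3)]
    AtD = [sum(r[i] * d for r, d in zip(A, D)) for i in range(3)]
    return solve(AtA, AtD)
```

Two views already give four equations for the three unknowns; additional views simply append more rows before the normal equations are formed.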


Now, how does one determine the transformation matrix T for
each view? Going back to equation (1) we see that the transformation
matrix contains 12 unknown values. But since we are dealing with a
homogeneous system, the matrix will include an arbitrary scale
factor and we are free to set one of the unknowns to any non-zero
value. To find the 11 remaining unknowns we will need 11 equations
of the form shown in (3) above. Since each point has two of these
equations associated with it, the x,y,z and u,v coordinates for at
least 5 1/2 points in each picture must be known in order to solve
for T. If we use six points a,b,c,d,e,f and set $T_{43} = 1$ the
resulting system can be written in matrix form as follows.


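$$\left[\begin{array}{ccccccccccc}
x_a & y_a & z_a & 1 & 0 & 0 & 0 & 0 & -u_a x_a & -u_a y_a & -u_a z_a \\
x_b & y_b & z_b & 1 & 0 & 0 & 0 & 0 & -u_b x_b & -u_b y_b & -u_b z_b \\
x_c & y_c & z_c & 1 & 0 & 0 & 0 & 0 & -u_c x_c & -u_c y_c & -u_c z_c \\
\vdots & & & & & & & & & & \vdots \\
0 & 0 & 0 & 0 & x_a & y_a & z_a & 1 & -v_a x_a & -v_a y_a & -v_a z_a \\
\vdots & & & & & & & & & & \vdots \\
0 & 0 & 0 & 0 & x_f & y_f & z_f & 1 & -v_f x_f & -v_f y_f & -v_f z_f
\end{array}\right]
\left[\begin{array}{c}
T_{11} \\ T_{21} \\ T_{31} \\ T_{41} \\ T_{12} \\ T_{22} \\ T_{32} \\ T_{42} \\ T_{13} \\ T_{23} \\ T_{33}
\end{array}\right]
=
\left[\begin{array}{c}
u_a \\ u_b \\ u_c \\ \vdots \\ v_a \\ \vdots \\ v_f
\end{array}\right] \qquad (7)$$

If more than 5 1/2 points are available then multiplying both
sides of (7) by $A^T$

$$[A^T A][T] = [A^T B] \qquad (8)$$

and solving the resulting system of 11 equations using standard
techniques will give a least-mean-square solution for the unknown T
values. For those interested, an equivalent but more awkward
approach to finding the T matrix is given in [1].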
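The determination of T from reference points can be sketched in the same style. The names `solve` and `calibrate` are hypothetical. The sketch builds two rows per reference point from equation (3) with the last element of T fixed at 1; it interleaves the u and v rows, which leaves the normal equations unchanged.

```python
def solve(M, rhs):
    """Solve M x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(M[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x

def calibrate(refs):
    """Estimate the 4x3 matrix T (last element fixed at 1).

    `refs` is a list of ((x, y, z), (u, v)) pairs for reference points;
    at least six non-coplanar points are assumed.
    """
    A, B = [], []
    for (x, y, z), (u, v) in refs:
        A.append([x, y, z, 1, 0, 0, 0, 0, -u * x, -u * y, -u * z])
        B.append(u)
        A.append([0, 0, 0, 0, x, y, z, 1, -v * x, -v * y, -v * z])
        B.append(v)
    n = 11
    AtA = [[sum(r[i] * r[j] for r in A) for j in range(n)] for i in range(n)]
    AtB = [sum(r[i] * b for r, b in zip(A, B)) for i in range(n)]
    t = solve(AtA, AtB)   # T11 T21 T31 T41 T12 T22 T32 T42 T13 T23 T33
    t.append(1.0)         # the element fixed at 1
    return [[t[k], t[4 + k], t[8 + k]] for k in range(4)]
```

Forming the normal equations squares the condition number of the system, so for poorly conditioned reference layouts an orthogonal factorization would be a more robust choice; the sketch keeps the normal-equation form because that is the method the text describes.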

C.2  Application

You will have noted from the above discussion that this
technique requires that each photograph show at least six
"reference" points whose three-dimensional x,y,z coordinates are
known. This requirement has been satisfied by surrounding the
surface to be measured with a cube of known dimensions. The corner
points of the cube act as the reference points. A wooden stick
figure cube was built for small objects and a larger cube for human
faces was constructed using weighted strings with beads forming the
corners of the cube. See Figure C.1.

Since  it  is  necessary  to  digitize  exactly  the  same  surface 
points  in  each  of  the  photographs,  some  method  of  marking  or 
identifying  the  surface  points  is  needed.  For  irregular  surfaces 
this  may  not  be  a problem  but  for  smooth  surfaces  it  was  necessary 
to draw a grid of points on the surface as shown in Figure C.1.


Figure C.1 - Data photographs.

C.3  Accuracy

Errors  may  enter  the  process  in  several  ways.  Obviously  any 
error  in  determining  the  three-dimensional  position  of  the  reference 
points  will  influence  the  results.  Errors  in  measuring  the 
two-dimensional  photograph  coordinates  of  the  points  will  also 
influence  the  results.  In  addition,  computational  errors  (round 
off,  truncation,  etc.)  may  influence  the  results. 

To  minimize  the  errors  in  measuring  the  photograph  coordinates, 
the  photographs  should  be  enlarged  to  take  advantage  of  the  maximum 
tablet  resolution. 

When  solving  the  systems  of  equations  (6)  and  (8)  above,  good 
numerical  techniques  such  as  Gaussian  Elimination  with  partial 
pivoting  should  be  used. 

Some care should be taken in selecting camera positions for the
various views. Every point of interest on the surface must be
visible in at least two views. The camera positions should not be
close to each other. As the camera positions approach each other
small errors tend to be magnified.

A detailed error analysis has not been done but experience with
the method indicates that with reasonable care in measuring the
reference points and in digitizing the photographs the total error
is less than one percent of the reference cube size. Typically the
error is .5 percent or .05 inches when using a 10 inch reference
cube. These figures apply when using 11 by 14 inch enlargements
digitized on a tablet with .05 inch accuracy.


Figure C.2 - Shaded images generated using data
extracted from the data photographs shown in Figure C.1.

C.4  Conclusion 

This  technique  has  been  successfully  used  to  obtain  point 
position  data  for  the  polygonal  representation  of  several  human 
faces and a human body. Figure C.2 shows several halftone images
generated using point position data extracted from the photographs
shown in Figure C.1.


APPENDIX  D 


THE SPEECH ANIMATION FILM

This appendix consists of a 16mm sound film titled "Speech
Synchronized Computer Generated Facial Animation." The film contains
the six computer generated speech animation segments discussed in
Chapter 4. In addition it has a segment of the actor speaking. The
parameter functions used to produce the computer generated segments
are shown in Appendix B.